Graph-Based Clustering for Computational Linguistics: A Survey
Abstract
In this survey, we give an overview of graph-based clustering and its applications in computational linguistics. We summarize graph-based clustering as a five-part story: hypothesis, modeling, measure, algorithm, and evaluation. We then survey three typical NLP problems to which graph-based clustering approaches have been successfully applied. Finally, we comment on the strengths and weaknesses of graph-based clustering and envision it as a promising solution for some emerging NLP problems.
Similar resources
Evolutionary Graph Clustering using Graph and Cluster Mixtures
Many networks, and accordingly their representations as graphs, are subject to structural changes during the course of their existence [CKT06]. Examples of such evolutionary networks include friendship networks in online communities, co-authorship networks in the scientific domain and collocation networks in computational linguistics. Studying the evolution of such networks can provide vital data...
A partition-based algorithm for clustering large-scale software systems
Clustering techniques are used to extract the structure of software for understanding, maintaining, and refactoring. In the literature, most of the proposed approaches for software clustering are divided into hierarchical algorithms and search-based techniques. In the former, clustering is a process of merging (splitting) similar (non-similar) clusters. These techniques suffered from the drawba...
An Application of Operational Research to Computational Linguistics: Word Ambiguity
This paper draws on graph theory and optimization techniques to develop a new measure of word ambiguity (e.g., homonymy and polysemy) for use in psycholinguistic research. This measure provides information regarding the uncertainty of the intended meaning of English words. Specifically, data about sixty-four thousand distinct words was collected from a corpus of close to three hundred million w...
A Graph-Based Clustering Approach to Identify Cell Populations in Single-Cell RNA Sequencing Data
Introduction: The emergence of single-cell RNA-sequencing (scRNA-seq) technology has provided new information about the structure of cells, and provided data with very high resolution of the expression of different genes for each cell at a single time. One of the main uses of scRNA-seq is data clustering based on expressed genes, which sometimes leads to the detection of rare cell populations. ...
Assessment of the Performance of Clustering Algorithms in the Extraction of Similar Trajectories
In recent years, the rapid growth of spatial trajectory data and the need to process it and extract useful information and meaningful patterns have attracted many researchers to the field of spatio-temporal trajectory clustering. The processing and analysis of these trajectories have resulted in the extraction of useful information whic...
Publication date: 2010